Refined Error Bounds for Several Learning Algorithms

Author

  • Steve Hanneke
Abstract

This article studies the achievable guarantees on the error rates of certain learning algorithms, with particular focus on refining logarithmic factors. Many of the results are based on a general technique for obtaining bounds on the error rates of sample-consistent classifiers with monotonic error regions, in the realizable case. We prove bounds of this type expressed in terms of either the VC dimension or the sample compression size. This general technique also enables us to derive several new bounds on the error rates of general sample-consistent learning algorithms, as well as refined bounds on the label complexity of the CAL active learning algorithm. Additionally, we establish a simple necessary and sufficient condition for the existence of a distribution-free bound on the error rates of all sample-consistent learning rules, converging at a rate inversely proportional to the sample size. We also study learning in the presence of classification noise, deriving a new excess error rate guarantee for general VC classes under Tsybakov’s noise condition, and establishing a simple and general necessary and sufficient condition for the minimax excess risk under bounded noise to converge at a rate inversely proportional to the sample size.
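The abstract's central objects are sample-consistent classifiers with monotone error regions in the realizable case. A minimal sketch of the idea, using the standard threshold class on [0, 1] (VC dimension 1, monotone error regions) as a hypothetical illustration: any learner that returns a threshold consistent with the sample has error decaying with the sample size n, which is the kind of rate the paper's bounds refine.

```python
import random

# Illustrative sketch (not the paper's algorithm): a sample-consistent
# learner for threshold classifiers h_t(x) = 1[x >= t], realizable case.
# Labels come from a true threshold t_true; data is uniform on [0, 1].

def sample_consistent_threshold(sample):
    """Return a threshold consistent with the labeled sample [(x, y), ...]."""
    # The largest point labeled 0 and the smallest point labeled 1
    # bracket every threshold consistent with the sample.
    lo = max((x for x, y in sample if y == 0), default=0.0)
    hi = min((x for x, y in sample if y == 1), default=1.0)
    return (lo + hi) / 2.0  # any value in (lo, hi] is sample-consistent

def error(t_hat, t_true):
    # Under the uniform distribution on [0, 1], the error region of a
    # threshold hypothesis is the interval between t_hat and t_true,
    # so its probability mass is simply the interval's length.
    return abs(t_hat - t_true)

random.seed(0)
t_true = 0.37
for n in (10, 100, 1000):
    xs = [random.random() for _ in range(n)]
    sample = [(x, int(x >= t_true)) for x in xs]
    t_hat = sample_consistent_threshold(sample)
    print(f"n={n:5d}  error={error(t_hat, t_true):.4f}")
```

Here the error shrinks roughly in inverse proportion to n, matching the 1/n-type rates the paper characterizes for classes whose consistent hypotheses have monotone error regions; for general VC classes the classical bounds carry an extra log(n) factor, and refining such factors is the article's focus.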


Similar Articles

Analysis of Complexity Bounds for PAC-Learning with Random Sets

Learnability in Valiant’s PAC-learning formalism is reformulated in terms of expected (average) error instead of confidence and error parameters. A finite-domain, random set formalism is introduced to develop algorithm-dependent, distribution-specific analytic error estimates. Two random set theorems for finite concept spaces are presented to facilitate these developments. Analyses are carried o...


Bounding the Generalization Error of Convex Combinations of Classifiers: Balancing the Dimensionality and the Margins

A problem of bounding the generalization error of a classifier f ∈ conv(H), where H is a "base" class of functions (classifiers), is considered. This problem frequently occurs in computer learning, where efficient algorithms of combining simple classifiers into a complex one (such as boosting and bagging) have attracted a lot of attention. Using Talagrand's concentration inequalities for empirical p...


Error Bounds for Transductive Learning via Compression and Clustering

This paper is concerned with transductive learning. Although transduction appears to be an easier task than induction, there have not been many provably useful algorithms and bounds for transduction. We present explicit error bounds for transduction and derive a general technique for devising bounds within this setting. The technique is applied to derive error bounds for compression schemes suc...


Bounds in Terms of Rademacher Averages

So far we have seen how to obtain generalization error bounds for learning algorithms that pick a function from a function class of limited capacity or complexity, where the complexity of the class is measured using the growth function or VC dimension in the binary case, and using covering numbers or the fat-shattering dimension in the real-valued case. These complexity measures however do not t...


Generalization error bounds for learning to rank: Does the length of document lists matter?

We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking. Existing generalization error bounds necessarily degrade as the size of the document list associated with a query increases. We show that such a degradation is not intrinsic to the problem. For several loss functions, including the cross-entropy loss used in the well...



Journal:
  • Journal of Machine Learning Research

Volume 17, Issue 

Pages  -

Publication date: 2016